<fix>[kvm]: extend reconnect echo timeout after libvirtd restart #3990
zstack-robot-2 wants to merge 1 commit into
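Walkthrough
The KVMHost connect/deploy flow now records a flag indicating whether the deploy triggered a libvirtd restart. The echo-host step then uses a dynamic echo timeout computed from the number of active VMs on the host, or falls back to the default timeout. The change also adds utility constants, counting and calculation methods, and the corresponding unit and integration tests.
Changes: adaptive echo timeout after libvirtd restart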
Sequence Diagram(s)
sequenceDiagram
participant KVMHost
participant AnsibleDeploy
participant EchoStep
KVMHost->>KVMHost: initialize libvirtRestarted=false
KVMHost->>AnsibleDeploy: assemble and run the deploy (deployArguments)
AnsibleDeploy-->>KVMHost: return the deployArguments settings
KVMHost->>KVMHost: compute libvirtRestarted from deployArguments
KVMHost->>EchoStep: compute echoTimeout (default, or extended based on vmCount)
KVMHost->>EchoStep: call restf.echo(..., timeout=echoTimeout)
Estimated review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
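Comment from yaohua.wu: Review: MR !9883 - ZSTAC-84691
Change overview: in KVM physical host reconnect
Conclusion: APPROVED. The fix goes in the right direction, is narrowly scoped, and fully matches the design in Jira. No merge-blocking issues were found. Improvement suggestions follow.
🟡 Warning
1. The two newly added pure functions lack unit tests
This issue is marked "must fix" for 5.5.22; the tests should be added before merging.
🟢 Suggestion
2. The timeout tuning constants are hard-coded
3. Confirm the libvirtd restart path driven by extension points
Verified items (no issues)
🤖 Robot Reviewer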
24ef2e4 to 719a2aa
Actionable comments posted: 1
🧹 Nitpick comments (1)
plugin/kvm/src/main/java/org/zstack/kvm/KVMHostUtils.java (1)
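37-39: 🏗️ Heavy lift: Suggest making the timeout tuning constants configurable
These three values are currently compile-time constants, so tuning them in the field requires a code change and a release. Consider turning them into GlobalProperty entries (keeping the current defaults) so that hosts of different sizes can be tuned as needed without changing the default behavior.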
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugin/kvm/src/main/java/org/zstack/kvm/KVMHostUtils.java` around lines 37 - 39, Replace the three compile-time constants in KVMHostUtils (LIBVIRT_RESTART_ECHO_TIMEOUT_VM_THRESHOLD, LIBVIRT_RESTART_ECHO_TIMEOUT_PER_VM_SECONDS, LIBVIRT_RESTART_ECHO_TIMEOUT_MAX_SECONDS) with GlobalProperty-backed configuration entries (keeping the current numeric values as defaults) and update all usages in KVMHostUtils to read the values via the GlobalProperty accessor rather than the static constants; name the properties clearly (e.g., libvirt.restart.echo.timeout.vmThreshold, .perVmSeconds, .maxSeconds), provide the existing defaults, add any necessary `@GlobalProperty` or registration so they are injectable/readable at runtime, and ensure unit tests or callers that referenced the static fields now obtain the values from the new GlobalProperty API.
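To make the suggestion concrete, here is a minimal sketch of runtime-overridable tuning values. It does not use ZStack's GlobalProperty mechanism: `java.util.Properties` stands in for it, the class name `EchoTimeoutTuning` is hypothetical, and the default numbers are placeholders because the PR's actual values are not shown in this conversation. Only the property names come from the review comment.

```java
import java.util.Properties;

/** Hypothetical stand-in for GlobalProperty-backed tuning values; not ZStack's API. */
final class EchoTimeoutTuning {
    // Property names follow the review suggestion.
    static final String VM_THRESHOLD_KEY = "libvirt.restart.echo.timeout.vmThreshold";
    static final String PER_VM_SECONDS_KEY = "libvirt.restart.echo.timeout.perVmSeconds";
    static final String MAX_SECONDS_KEY = "libvirt.restart.echo.timeout.maxSeconds";

    final long vmThreshold;
    final long perVmSeconds;
    final long maxSeconds;

    EchoTimeoutTuning(Properties props) {
        // The defaults below are placeholders, NOT the values used in the PR.
        vmThreshold = readLong(props, VM_THRESHOLD_KEY, 50L);
        perVmSeconds = readLong(props, PER_VM_SECONDS_KEY, 2L);
        maxSeconds = readLong(props, MAX_SECONDS_KEY, 600L);
    }

    /** Returns the configured value, or the default when the key is absent, blank, or not a number. */
    static long readLong(Properties props, String key, long defaultValue) {
        String raw = props.getProperty(key);
        if (raw == null || raw.trim().isEmpty()) {
            return defaultValue;
        }
        try {
            return Long.parseLong(raw.trim());
        } catch (NumberFormatException e) {
            return defaultValue;
        }
    }
}
```

With this shape, an unset property keeps the default behavior, which matches the reviewer's requirement that the change must not alter defaults.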
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@plugin/kvm/src/main/java/org/zstack/kvm/KVMHostUtils.java`:
- Around line 154-156: The boolean-string check in
shouldRestartLibvirtdDuringDeploy can mis-evaluate values with surrounding
whitespace; update the method to trim both parameters (init and restartLibvirtd)
before comparing (e.g., call trim() safely after null-checking each param) and
then use equalsIgnoreCase on the trimmed strings so values like " true " or
"\ntrue\t" are treated as true.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: http://open.zstack.ai:20001/code-reviews/zstack-cloud.yaml (via .coderabbit.yaml)
Review profile: CHILL
Plan: Pro
Run ID: 9a9b6257-b724-40cc-ba7a-66ed88f82483
📒 Files selected for processing (3)
- plugin/kvm/src/main/java/org/zstack/kvm/KVMHost.java
- plugin/kvm/src/main/java/org/zstack/kvm/KVMHostUtils.java
- test/src/test/java/org/zstack/test/kvm/KVMHostUtilsTest.java
public static boolean shouldRestartLibvirtdDuringDeploy(String init, String restartLibvirtd) {
    return "true".equalsIgnoreCase(init) || "true".equalsIgnoreCase(restartLibvirtd);
}
Suggest trimming before the boolean-string check to avoid a wrong result
Line 155 currently only calls equalsIgnoreCase. If a parameter value has leading or trailing whitespace (for example " true "), the check returns false, so the echo timeout may not be extended when it should be.
Suggested change
public static boolean shouldRestartLibvirtdDuringDeploy(String init, String restartLibvirtd) {
- return "true".equalsIgnoreCase(init) || "true".equalsIgnoreCase(restartLibvirtd);
+ String normalizedInit = init == null ? null : init.trim();
+ String normalizedRestartLibvirtd = restartLibvirtd == null ? null : restartLibvirtd.trim();
+ return "true".equalsIgnoreCase(normalizedInit) || "true".equalsIgnoreCase(normalizedRestartLibvirtd);
}
As per coding guidelines: check whether parameters that come from a Message have been trimmed; data that users copy and paste in a browser may carry spaces, line breaks, and similar characters.
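The suggested normalization can be exercised in isolation. The sketch below is a standalone variant, assuming nothing beyond the JDK; the class name `RestartFlagSketch` and the helper `isTrue` are hypothetical and are not part of KVMHostUtils.

```java
/** Standalone sketch of the trim-before-compare check; names are illustrative only. */
final class RestartFlagSketch {
    /** Null-safe: returns true only when the trimmed value equals "true", ignoring case. */
    static boolean isTrue(String value) {
        // String.trim() strips leading and trailing characters <= U+0020,
        // which covers spaces, tabs, and line breaks.
        return value != null && "true".equalsIgnoreCase(value.trim());
    }

    static boolean shouldRestartLibvirtdDuringDeploy(String init, String restartLibvirtd) {
        return isTrue(init) || isTrue(restartLibvirtd);
    }

    public static void main(String[] args) {
        System.out.println(shouldRestartLibvirtdDuringDeploy(" true ", null));
        System.out.println(shouldRestartLibvirtdDuringDeploy("false", "\ntrue\t"));
        System.out.println(shouldRestartLibvirtdDuringDeploy(null, null));
    }
}
```

Factoring the check into one helper keeps the two parameters from drifting apart if the normalization rule changes later.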
long vmCount = Q.New(VmInstanceVO.class)
        .eq(VmInstanceVO_.hostUuid, self.getUuid())
        .count();
Comment from yaohua.wu:
!=Stopped?
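The question above appears to ask whether stopped VMs should be left out of the count. A minimal in-memory sketch of that filter follows; it deliberately avoids ZStack's `Q` query API, and the `Vm` class, the `State` enum, and the method name are hypothetical stand-ins for illustration.

```java
import java.util.Arrays;
import java.util.List;

/** In-memory sketch of counting a host's VMs while skipping stopped ones. */
final class VmCountSketch {
    enum State { Running, Paused, Stopped }

    /** Hypothetical stand-in for a VM row; not ZStack's VmInstanceVO. */
    static final class Vm {
        final String hostUuid;
        final State state;

        Vm(String hostUuid, State state) {
            this.hostUuid = hostUuid;
            this.state = state;
        }
    }

    /** Counts VMs on the given host whose state is anything other than Stopped. */
    static long countVmsForEchoTimeout(List<Vm> vms, String hostUuid) {
        return vms.stream()
                .filter(vm -> hostUuid.equals(vm.hostUuid))
                .filter(vm -> vm.state != State.Stopped)
                .count();
    }

    public static void main(String[] args) {
        List<Vm> vms = Arrays.asList(
                new Vm("host-1", State.Running),
                new Vm("host-1", State.Stopped),
                new Vm("host-2", State.Running));
        System.out.println(countVmsForEchoTimeout(vms, "host-1"));
    }
}
```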
    Assert.assertFalse(KVMHostUtils.shouldForceTlsRedeploy(true, false, false));
}

@Test
Comment from yaohua.wu:
There should be an integration test; this UT is not run in the PR system.
Comment from yaohua.wu: Record here whether a deploy happened and whether libvirt was restarted: libvirtRestarted = deployArguments.getRestartLibvirtd()
@Override
public void run(final FlowTrigger trigger, Map data) {
    final long echoTimeout = getKvmAgentEchoTimeoutAfterDeploy(deployed, restartLibvirtdDuringDeploy);
Comment from yaohua.wu:
final long echoTimeout = libvirtRestarted ? getLibvirtRestartedEchoTimeout() : TimeUnit.SECONDS.toMillis(CoreGlobalProperty.REST_FACADE_ECHO_TIMEOUT)
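The reviewer's one-liner selects between the default echo timeout and an extended one. The sketch below shows one way such a selection could work. The scaling rule (add a per-VM allowance above a VM threshold, then cap at a maximum) is an assumption inferred from the constant names mentioned in this review, not the PR's confirmed formula; all tuning values are passed in as parameters because the real defaults are not shown here, and `EchoTimeoutSketch` is a hypothetical name.

```java
import java.util.concurrent.TimeUnit;

/** Sketch of an adaptive echo timeout; the scaling rule is an assumption, not the PR's code. */
final class EchoTimeoutSketch {
    static long echoTimeoutMillis(boolean libvirtRestarted, long vmCount,
                                  long defaultSeconds, long vmThreshold,
                                  long perVmSeconds, long maxSeconds) {
        long defaultMillis = TimeUnit.SECONDS.toMillis(defaultSeconds);
        // No restart, or a small host: keep the default timeout unchanged.
        if (!libvirtRestarted || vmCount <= vmThreshold) {
            return defaultMillis;
        }
        // Assumed rule: add a fixed allowance for every VM above the threshold.
        long extendedSeconds = defaultSeconds + (vmCount - vmThreshold) * perVmSeconds;
        long cappedSeconds = Math.min(extendedSeconds, maxSeconds);
        // Never return less than the default, even if the cap is configured too low.
        return Math.max(defaultMillis, TimeUnit.SECONDS.toMillis(cappedSeconds));
    }

    public static void main(String[] args) {
        // Placeholder numbers: 60 s default, threshold 50 VMs, 2 s per VM, 600 s cap.
        System.out.println(echoTimeoutMillis(true, 100, 60, 50, 2, 600));
    }
}
```

Keeping the default as a floor means a misconfigured cap can never make reconnect stricter than it was before the change.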
Resolves: ZSTAC-84691 Change-Id: I6668736975707575756d7665647867686e6b6b76
719a2aa to e7851c6
🧹 Nitpick comments (2)
test/src/test/groovy/org/zstack/test/integration/kvm/host/LibvirtRestartEchoTimeoutCase.groovy (2)
68-78: ⚡ Quick win: Suggest explicitly asserting the precondition on the tested VM's initial state to improve test stability.
The current assertion implicitly assumes that "this VM should initially be counted". Consider adding a state precondition assertion before the update (for example, not Stopped) so that later fixture changes do not introduce fragile failures.
Optional change example
 VmInstanceState originalState = Q.New(VmInstanceVO.class)
         .select(VmInstanceVO_.state)
         .eq(VmInstanceVO_.uuid, vm.uuid)
         .findValue()
+ assert originalState != VmInstanceState.Stopped
 try {
     SQL.New(VmInstanceVO.class)
             .eq(VmInstanceVO_.uuid, vm.uuid)
             .set(VmInstanceVO_.state, VmInstanceState.Stopped)
             .update()
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/src/test/groovy/org/zstack/test/integration/kvm/host/LibvirtRestartEchoTimeoutCase.groovy` around lines 68 - 78, explicitly assert the initial state of the tested VM before updating its state, to avoid a fragile test: after reading originalState (via Q.New(VmInstanceVO.class).select(VmInstanceVO_.state).eq(VmInstanceVO_.uuid, vm.uuid).findValue()), add an assertion such as assert originalState != VmInstanceState.Stopped to guarantee that the VM is initially counted; optionally, before using KVMHostUtils.countVmsForLibvirtRestartEchoTimeout(host.uuid), assert that it equals originalCount to make the precondition explicit, so that the effect of the later SQL.New(...).set(..., VmInstanceState.Stopped).update() can be verified reliably.
46-57: ⚡ Quick win: Add mutual exclusion to tests that modify the global timeout value, to avoid concurrency conflicts.
Several tests in the code (including KVMHostUtilsTest.java and this file) temporarily modify CoreGlobalProperty.REST_FACADE_ECHO_TIMEOUT without any synchronization. If the tests run in parallel, the modifications interfere with one another. Consider placing the configuration change, the assertions, and the restore inside the same mutually exclusive section (for example synchronized (CoreGlobalProperty.class)) to improve robustness.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/src/test/groovy/org/zstack/test/integration/kvm/host/LibvirtRestartEchoTimeoutCase.groovy` around lines 46 - 57, Tests modify CoreGlobalProperty.REST_FACADE_ECHO_TIMEOUT without synchronization, causing race conditions when tests run in parallel; wrap the block that changes, asserts, and restores REST_FACADE_ECHO_TIMEOUT in a mutual exclusion on CoreGlobalProperty.class (e.g., synchronized(CoreGlobalProperty.class)) so the sequence in LibvirtRestartEchoTimeoutCase (the try/finally that sets and restores REST_FACADE_ECHO_TIMEOUT and calls KVMHostUtils.calculateLibvirtRestartEchoTimeoutMillis) cannot interleave with other tests (like KVMHostUtilsTest.java) that also mutate the same static property.
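The guard described above can be sketched as a small helper that overrides a static value, runs the test body, and always restores the original. `FakeGlobalProperty` below is a hypothetical stand-in for `CoreGlobalProperty`, and the helper name is illustrative; the pattern only protects against interleaving when every test that mutates the field goes through the same lock in the same JVM.

```java
import java.util.function.Supplier;

/** Sketch of a lock-guarded temporary override of a static setting in tests. */
final class GuardedStaticOverrideSketch {
    /** Hypothetical stand-in for a class holding a mutable static property. */
    static final class FakeGlobalProperty {
        static volatile int ECHO_TIMEOUT_SECONDS = 60;
    }

    /** Sets the value, runs the body, and restores the original, all under one lock. */
    static <T> T withEchoTimeout(int temporarySeconds, Supplier<T> body) {
        synchronized (FakeGlobalProperty.class) {
            int original = FakeGlobalProperty.ECHO_TIMEOUT_SECONDS;
            FakeGlobalProperty.ECHO_TIMEOUT_SECONDS = temporarySeconds;
            try {
                return body.get();
            } finally {
                // Restore even when the body throws, so later tests see the original value.
                FakeGlobalProperty.ECHO_TIMEOUT_SECONDS = original;
            }
        }
    }

    public static void main(String[] args) {
        int seen = withEchoTimeout(5, () -> FakeGlobalProperty.ECHO_TIMEOUT_SECONDS);
        System.out.println(seen + " then " + FakeGlobalProperty.ECHO_TIMEOUT_SECONDS);
    }
}
```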
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: http://open.zstack.ai:20001/code-reviews/zstack-cloud.yaml (via .coderabbit.yaml)
Review profile: CHILL
Plan: Pro
Run ID: d4e15424-073c-474a-976e-14a87ab53830
📒 Files selected for processing (4)
- plugin/kvm/src/main/java/org/zstack/kvm/KVMHost.java
- plugin/kvm/src/main/java/org/zstack/kvm/KVMHostUtils.java
- test/src/test/groovy/org/zstack/test/integration/kvm/host/LibvirtRestartEchoTimeoutCase.groovy
- test/src/test/java/org/zstack/test/kvm/KVMHostUtilsTest.java
🚧 Files skipped from review as they are similar to previous changes (1)
- test/src/test/java/org/zstack/test/kvm/KVMHostUtilsTest.java
Resolves: ZSTAC-84691
Change-Id: I6668736975707575756d7665647867686e6b6b76
sync from gitlab !9883